Astronomy & Astrophysics manuscript no. 
(will be inserted by hand later) 



The zero-crossing scale of the galaxy correlation function and 

the problem of galaxy bias 






Francesco Sylos Labini^{1,2}

^1 "Enrico Fermi Center", Via Panisperna 89 A, Compendio del Viminale, 00184 Rome, Italy
^2 "Istituto dei Sistemi Complessi" CNR, Via dei Taurini 19, 00185 Rome, Italy

Received / Accepted 

Abstract. One of the main problems in the study of large-scale galaxy structures concerns the relation between the correlation properties of a given population of objects and those of a subsample selected from it, when the selection is performed on physical quantities such as luminosity or mass. I consider the case where the sampling is defined as in the simplest thresholding selection scheme of the peaks of a Gaussian random field, as well as the case of the extraction of point distributions in high-density regions from gravitational N-body simulations. I show that the zero-crossing scale of $\xi(r)$ is invariant under sampling. By considering recent measurements in the 2dF and SDSS galaxy surveys I note that the zero-crossing length has not yet been clearly identified, while a dependence on the finite sample size related to the integral constraint is manifest. I show that this implies that other length scales derived from $\xi(r)$ are also affected by finite-size effects. I discuss the theoretical implications of these results for the comparison of structures formed in N-body simulations with those observed in galaxy samples, and different tests to study this problem.

1. Introduction

The problem of "sampling" discrete and continuous distributions is a central one in studies of cosmological density fields and particularly of galaxy structures. By sampling I mean the operation performed when one extracts, from a given distribution, a subsample of it by making a selection on a certain parameter $\mu$. For example, one can make such a selection by extracting, from the whole population of galaxies of all luminosities, only those objects whose luminosity is brighter than a given threshold. A similar selection can be done by considering galaxy color. Alternatively one may consider a certain density field, continuous or discrete, where the fluctuation field is a stochastic variable of position (for example a Gaussian fluctuation field), and one may sample the distribution by extracting fluctuations larger than a given threshold in the density fluctuation.

In general the problem consists in understanding the relation between the statistical properties of the "biased" distribution and those of the original one, particularly between the two-point correlation function $\xi(r;\mu>\hat{\mu})$ of the sampled field (where $\hat{\mu}$ is the threshold) and the original $\xi(r;\mu)$. The interest lies, for instance, in the fact that in studies of galaxy samples one has to perform a sampling when measuring the two-point correlation function. In the comparison of observations with theoretical models the sampling procedure plays a crucial role in the determination of the physics of the system. In fact, in the analysis of cosmological N-body simulations one also needs to extract subsamples of points which, according to some models, would represent galaxies instead of dark matter particles. In these contexts, the simplest theoretical model describing biasing (introduced by Kaiser 1984) is not able to take into account the effects related to strong clustering, as it was developed for a continuous Gaussian field, and thus it does not provide a useful analytical treatment of the problem of strong clustering, which is instead the relevant one for galaxy structures. I show however that an important feature of this model is preserved also in cases where strong clustering in point distributions is present.

It is very difficult to treat the problem of sampling for 
a generic case. What one can do realistically is to consider 
a certain point distribution, with given correlation prop- 
erties and a certain sampling procedure and then look for 
invariant quantities under sampling, such as characteris- 
tic length scales which are unaffected by sampling. This is 
the strategy I am going to consider in this paper. 

In this paper I first briefly review (Sec. 2) the effect of sampling in the simplest model of a correlated Gaussian density field. In Sec. 3 I show that for the case of a Cold Dark Matter (CDM) type model such a sampling does not change the intrinsic length scale defined by $\xi(r_{zp};\mu) = 0$, while other length scales are affected, in a linear or non-linear way depending on scales and amplitudes. I then consider in Sec. 4 particle distributions obtained from cosmological N-body simulations, extracted in such a way as to represent large-amplitude fluctuations ultimately associated with galaxies in some models. I show that also in this case the scale $r_{zp}$ remains invariant under sampling, while, for example, the scale $r_0$ such that $\xi(r_0;\mu>\hat{\mu}) = 1$ changes as a function of the threshold $\hat{\mu}$. An important point related to finite-sample measurements of the correlation function is discussed in Sec. 5: the problem of the determination of the zero-point in relation to the estimators of $\xi(r)$ and the finite-size effects which may artificially force the correlation function to cross zero, even when the underlying distribution, in the ensemble sense, has, for example, only positive correlations. In this case the scale $r_{zp}$ is a finite-size effect. I consider in Sec. 6 the observational situation, also in the light of the recent results of Eisenstein et al. (2005) on a very large and deep sample of the Sloan Digital Sky Survey (SDSS). I discuss the fact that in different galaxy samples the length scale $r_{zp}$ is not found to be stable, varying from 20 Mpc/h in the CfA1 catalog to about 120 Mpc/h in the SDSS data. The conclusions are discussed in Sec. 7: I find that, contrary to the theoretical CDM case and to results in N-body simulations, the observational evidence supports the finite-size interpretation of the zero-crossing scale of the estimated $\xi(r;\mu>\hat{\mu})$. The case for such a variation can be directly clarified by studying the conditional average density. I then discuss the implications concerning other length scales measured by the estimated correlation function, such as the scale where $\xi(r_0;\mu>\hat{\mu}) = 1$, concluding that, in galaxy samples, finite-size effects may play the dominant role in their determination. Finally I discuss some direct tests to clarify the situation.



2. Sampling a Gaussian random field 

Let us now discuss the simplest biasing scheme of a continuous and correlated density field, introduced by Kaiser (1984). Suppose we have a Gaussian random field with correlations described by $\xi(r;\mu)$ and with variance $\langle\mu^2\rangle = \sigma^2$ (where $\mu$ is the mean-density-normalized fluctuation). One can identify the fluctuations of the field that are larger than $\nu$ times the standard deviation $\sigma$. This selection defines a biased field with equal weights: it is 0 where the fluctuations of the original field are smaller than $\hat{\mu} = \nu\sigma$ and 1 where they are equal to or larger than $\hat{\mu}$. When one changes the threshold $\nu$ one selects different regions of the underlying Gaussian random field, corresponding to fluctuations of differing amplitudes. The reduced two-point correlation function of the selected objects is then that of the peaks, $\xi(r;\mu>\hat{\mu})$, which is enhanced with respect to that of the underlying density field, $\xi_c(r;\mu)$ (normalized to $\sigma^2$). One may compute the following first-order approximation (Durrer et al. 2003)

$$\xi(r;\mu>\hat{\mu}) \simeq \exp\left(\nu^2\,\frac{\xi_c(r;\mu)}{1+\xi_c(r;\mu)}\right) - 1 , \qquad (1)$$

which reduces to $\xi(r;\mu>\hat{\mu}) \approx \nu^2\,\xi_c(r;\mu)$ when $\nu^2\,\xi_c(r;\mu) \ll 1$. Thus, if present in the underlying distribution, the characteristic length scale $r_{zp}$ is not changed under this selection procedure, i.e.

$$\xi(r_{zp};\mu) = \xi(r_{zp};\mu>\hat{\mu}) = 0 \quad \forall\,\hat{\mu} . \qquad (2)$$

On the other hand, for $\xi(r;\mu>\hat{\mu}) > 1$ the amplification is non-linear as a function of scale: this means that the functional behavior of $\xi(r;\mu>\hat{\mu})$ is different from that of $\xi(r;\mu)$ in the regime where $\xi(r;\mu>\hat{\mu}) > 1$. In addition the scale $r_0$ such that $\xi(r_0;\mu>\hat{\mu}) \approx 1$ changes in a non-linear way as a function of the threshold (Gabrielli, Sylos Labini & Durrer 2000; Durrer et al. 2003).
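The invariance expressed by Eq. 2 can be checked numerically with a minimal sketch. The function `toy_xi_c` below is a hypothetical normalized correlation function chosen only because it changes sign at a known scale (it is not taken from the paper), and the bisection helper `zero_crossing` is likewise an illustrative name. Only Eq. 1 itself comes from the text.

```python
import math

def biased_xi(xi_c, nu):
    """Eq. 1: exp(nu^2 * xi_c / (1 + xi_c)) - 1, using expm1 for small arguments."""
    return math.expm1(nu**2 * xi_c / (1.0 + xi_c))

def toy_xi_c(r, r_zp=1.0, amp=0.01):
    """Hypothetical normalized correlation: > 0 for r < r_zp, < 0 beyond (assumption)."""
    return amp * (1.0 - (r / r_zp) ** 2) * math.exp(-r)

def zero_crossing(f, lo, hi, tol=1e-12):
    """Bisection root of f on [lo, hi]; assumes f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for nu in (1.0, 2.0, 3.0):
        root = zero_crossing(lambda r: biased_xi(toy_xi_c(r), nu), 0.5, 2.0)
        # In the weakly correlated regime the amplification approaches nu^2.
        ratio = biased_xi(toy_xi_c(0.9), nu) / toy_xi_c(0.9)
        print(f"nu={nu}: zero crossing at r={root:.6f}, amplification={ratio:.3f}")
```

Because the argument of the exponential in Eq. 1 vanishes exactly where $\xi_c$ does, the root is the same for every threshold, while the amplitude away from the root grows with $\nu$.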

3. Sampling a CDM type density field 

I now discuss the effect of the previous biasing scheme on a cosmologically relevant density field. It has been discussed in Gabrielli, Joyce & Sylos Labini (2002) that the main features of correlated (Gaussian) density fields in standard cosmological models can be captured by the following behavior of the power spectrum of mean-density-normalized density fluctuations, $P(k;\mu) = A\,k\,\exp(-k/k_c)$, where $A$ is a constant and $k_c$ is the characteristic wave-number of the "turn-over" scale. Its Fourier transform, the real-space two-point reduced correlation function, has the following behavior:

$$\xi(r;\mu) = \frac{A}{\pi^2}\,\frac{3/k_c^2 - r^2}{\left(1/k_c^2 + r^2\right)^3} , \qquad (3)$$

where $\mu$ is now the value of the normalized density fluctuation. One may consider the characteristic length scale $r_{zp}$ such that $\xi(r_{zp};\mu) = 0$, i.e.

$$r_{zp} = \frac{\sqrt{3}}{k_c} . \qquad (4)$$

Other length scales can be defined that depend on the amplitude of $\xi(r;\mu)$: for example one may identify the scale at which $\xi(r;\mu)$ has a certain (positive) value (which in this context has to be smaller than one by definition, as this is a continuous Gaussian random density field), thus identifying a length scale which depends on the amplitude $A$. According to Eq. 2 the scale $r_{zp}$ is invariant under the biasing scheme discussed in the previous section (see Fig. 1).
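Two properties of this toy model can be verified with a short numerical sketch: the zero crossing at $r = \sqrt{3}/k_c$, and the fact that, because $P(k) \to 0$ for $k \to 0$, the integral of $\xi(r)\,r^2$ over an ever larger range tends to zero. The prefactor $A/\pi^2$ assumes the Fourier convention $\xi(r) = (2\pi)^{-3}\int P(k)\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k$; the function names and the Simpson integrator are illustrative choices, not part of the original analysis.

```python
import math

def xi_toy(r, A=1.0, kc=1.0):
    """Correlation function for P(k) = A k exp(-k/kc), with prefactor A/pi^2."""
    a2 = 1.0 / kc**2
    return (A / math.pi**2) * (3.0 * a2 - r * r) / (a2 + r * r) ** 3

def integral_xi_r2(R, A=1.0, kc=1.0, n=20000):
    """Composite Simpson estimate of int_0^R xi(r) r^2 dr (n must be even)."""
    h = R / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * xi_toy(r, A, kc) * r * r
    return total * h / 3.0

def integral_closed_form(R, A=1.0, kc=1.0):
    """Same integral analytically: (A/pi^2) R^3 / (1/kc^2 + R^2)^2, decaying as 1/R."""
    a2 = 1.0 / kc**2
    return (A / math.pi**2) * R**3 / (a2 + R * R) ** 2

if __name__ == "__main__":
    print("xi at sqrt(3)/kc:", xi_toy(math.sqrt(3.0)))
    for R in (10.0, 100.0):
        print(R, integral_xi_r2(R), integral_closed_form(R))
```

The closed form follows from differentiating $r^3/(1/k_c^2 + r^2)^2$, and its decay as $1/R$ shows how slowly the positive and negative parts of $\xi(r)$ compensate each other.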

The correlation function given by Eq. 3 differs from that of a more realistic CDM model in its behavior at scales $r < r_{zp}$: in the CDM model, in that range of scales, $\xi(r;\mu)$ has an approximate power-law behavior, with the introduction of some other characteristic scales. However the zero-crossing scale $r_{zp}$ is still a clear intrinsic feature which is not changed by the biasing scheme discussed in the previous section. At large scales $r > r_{zp}$ the CDM reduced correlation function has the same $\xi(r) \sim -r^{-4}$ behavior as Eq. 3. Both satisfy the important constraint

$$\int_0^\infty \xi(r;\mu)\,r^2\,dr = 0 , \qquad (5)$$

which has been called "the super-homogeneous condition", in order to make clear the fact that this corresponds to a global condition on the correlation properties of particular systems which display a sort of long-range order or, alternatively, are more ordered than purely uncorrelated stochastic processes (e.g. Poisson) (Gabrielli, Joyce & Sylos Labini 2002).

Fig. 1. Absolute value of the reduced correlation function of the toy model described by Eq. 3 (solid line) and of the ones corresponding to different values of the threshold parameter $\nu$ calculated by applying Eq. 1. The amplification is non-linear at small scales, where $\xi(r;\mu>\hat{\mu}) > 1$, linear at large scales, and the zero-crossing scale is invariant under biasing.

4. Sampling points in cosmological N-body simulations

Gravitational clustering in the regime of strong fluctuations is usually studied through gravitational N-body simulations. The particles are not meant to describe galaxies but collisionless dark-matter mass tracers (but see the discussion in e.g. Baertschiger & Sylos Labini 2004). During gravitational evolution, complex non-linear dynamics form non-linear structures at small scales, while at large scales a linear amplification occurs according to linear perturbation theory. Thus, while on large scales correlation properties do not change from the beginning, apart from a simple linear scaling of amplitudes, at small scales non-linear correlations are built. Typically in these simulations non-linear clustering is formed up to scales of order of a few Mpc (see e.g. Baertschiger, Joyce & Sylos Labini 2002), while the intrinsic scale $r_{zp}$ is unchanged, as typically $r_{zp} > 50$ Mpc/h in CDM models (see e.g. Gabrielli, Joyce & Sylos Labini 2002).

Fig. 2. Reduced correlation function for the four samples of points selected in the simulation: the original dark matter (DM) field, all "galaxies" (ALL), blue galaxies (BLUE) and red galaxies (RED). Distances are given in units of the sample size L = 141.3 Mpc/h.

At late times one can identify subsamples of points which trace the high-density regions, and these would represent the "galaxies" whose statistical properties are ultimately compared with the ones found in galaxy samples. Here I consider the GIF galaxy catalog (Kauffmann et al. 1999) constructed from a $\Lambda$CDM simulation run by the Virgo consortium (Jenkins et al. 1998). The catalog is built by first identifying the halos, which represent almost spherical structures with a power-law density profile from their center. The number of galaxies belonging to each halo is set proportional to the total number of points belonging to the halo raised to a certain power. This procedure identifies points lying in high-density regions of the dark-matter particles. One may assign to each point a luminosity and a color on the basis of a certain criterion which is not relevant for what follows (see Sheth et al. 2001 and references therein). The resulting catalog is divided into two subsamples based on "galaxy" color as in Sheth et al. (2001): (brighter) red galaxies (for which B-I is redder than 1.8) and (fainter) blue galaxies (B-I bluer than 1.8).

In summary four samples of points may be considered: (i) the original dark matter particles with $N = 256^3$ particles, (ii) all galaxies with N = 15445, (iii) blue galaxies with N = 11023 and (iv) red galaxies with N = 4422. In Fig. 2 the behavior of $\xi(r)$ for the different objects is shown^1. One may notice that $\xi(r)$ for red (blue) galaxies has a larger (smaller) amplitude than that of the original sample (all galaxies). The underlying dark matter particles show almost the same amplitude as all galaxies, although a change of slope at small scales is manifest. The amplification is linear, i.e. $\xi(r)$ for red galaxies shows almost the same functional behavior as that of all galaxies but with a larger amplitude. Clearly the original point distribution is not Gaussian, at least in the relevant range of scales considered, but is characterized by strong fluctuations, and thus one should explain such a mechanism of amplification (or de-amplification) differently from what has been proposed by Kaiser (1984). On the other hand the scale where the power-law behavior breaks down, and thus the scale $r_{zp}$ such that $\xi(r_{zp}) = 0$, is invariant under sampling as for the simple Gaussian threshold biasing scheme discussed above: the amplitude-independent characteristic scale is not changed under biasing. The biasing mechanism described above does not introduce new length scales in the system or change the intrinsic one, but it does alter the amplitude of the average density and thus any scale dependent on it (e.g. the scale $r_0$ such that $\xi(r_0) = 1$).

^1 The estimator of $\xi(r)$ is the full-shell one (see Eq. 8 below).

Note that the zero-crossing scale of $\xi(r)$ cannot in general be well established because of the statistical fluctuations which affect any finite-sample estimation of correlations. In this case however a clear signature of the zero-crossing scale is given by the sharp cut-off of the reduced correlation function, in a log-log plot, at a scale of order $r_{zp}$. This happens when the amplitude of the estimated $\xi(r)$ is still large enough that statistical noise does not affect the measurement in a substantial way. In the case considered, in fact, the regime changes from being positively correlated, and larger than unity, to small anti-correlation. This is the way used hereafter to define the scale $r_{zp}$. In the general case, where the functional behavior of the correlation function is more complicated (e.g. with a very slow approach to zero), the way the zero-crossing scale is estimated must be carefully explored.

In order to test the reality of the zero-crossing scale, one may cut the sample at the scale $R_s \approx 0.3$ (in units normalized to the box side) and recompute the correlation function. No sensible change is found in the scale $r_{zp}$. As discussed below, this happens because the conditional density for scales $r > 0.3$ is very well approximated by a flat behavior corresponding to the transition from strong to weak clustering, and the scale $r_{zp}$ is related, in this case, to the scale where the conditional density flattens.

Note that in the regime where $\xi(r) \gg 1$ no clear a priori prediction can be formulated on the amount of increase of the amplitude of $\xi(r)$ with sampling. In practice the approach to this problem is to choose a selection procedure that gives results similar to what is found in galaxy catalogs. Thus the observations are used to tune the selection in the simulations. The idea is in fact that one may change the way points are selected until a satisfactory agreement with what is observed in galaxy catalogs has been found. This can be done for the strongly correlated regime, but the selection employed does not change $r_{zp}$, which thus becomes the main length scale to be studied when relating observed galaxy distributions to simulations and ultimately to the distribution of the underlying dark matter particles.

Fig. 3. Conditional density for the samples of points shown in Fig. 2. The conditional density for dark matter particles (DM) has been normalized arbitrarily. The reference line has a slope $\gamma = 1.7$.

In order to understand the origin of the amplification observed in the sampled point distributions it is useful to study the behavior of the conditional density, which has a straightforward interpretation in terms of correlations (see Fig. 3). This statistical tool gives the average number of points observed in an infinitesimal shell as a function of distance from a point of the distribution (and thus it is a conditional quantity) and can be written as (e.g. Gabrielli et al. 2004)

$$\langle n(r)\rangle_p = \frac{\langle n(r)\,n(0)\rangle}{\langle n(0)\rangle} , \qquad (6)$$

where $n(r)$ is the microscopic particle number density. This is related to $\xi(r)$ by the equation

$$\xi(r) = \frac{\langle n(r)\rangle_p}{n_0} - 1 , \qquad (7)$$

$n_0 > 0$ being the ensemble average density of the distribution.

The red galaxies are responsible for the strong correlations observed in the full sample, as their conditional density is almost the same as for all galaxies at small scales. At large scales there is instead a fast decrease, as the sample average of red galaxies is smaller than that of all galaxies (there are fewer objects). The amplification of $\xi(r)$ of the red galaxies with respect to the full sample can be explained as an almost constant value of the conditional density at small scales together with a decrease of the sample density. It follows from Eq. 7 that the amplitude of $\xi(r)$ is amplified if $\langle n(r)\rangle_p$ remains the same and $n_0$ is lowered. This means that for red galaxies the sampling is local, i.e. their conditional density is (almost) invariant at small scales. Clearly, as there are globally fewer objects, the sample density of red galaxies is smaller than that of all galaxies. On the other hand blue galaxies present only some residual correlations at small scales, and they are more numerous than red galaxies.
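The amplification mechanism implied by Eq. 7 can be made concrete with a small sketch. The numbers of points and the box side are those quoted in the text and in the caption of Fig. 2; the small-scale conditional density value is a purely hypothetical number, and the assumption that it is exactly the same for red and all galaxies is an idealization of the "almost invariant" behavior described above.

```python
def xi_from_conditional_density(n_p, n0):
    """Eq. 7: xi(r) = <n(r)>_p / n0 - 1."""
    return n_p / n0 - 1.0

N_ALL, N_RED = 15445, 4422   # numbers of points quoted in the text
VOLUME = 141.3 ** 3          # (Mpc/h)^3, cubic box of side L = 141.3 Mpc/h

n0_all = N_ALL / VOLUME
n0_red = N_RED / VOLUME

# Idealization: identical small-scale conditional density for both samples.
# The factor 100 is a hypothetical value chosen only for illustration.
n_p_small_scale = 100.0 * n0_all

xi_all = xi_from_conditional_density(n_p_small_scale, n0_all)
xi_red = xi_from_conditional_density(n_p_small_scale, n0_red)

# With the same <n(r)>_p, 1 + xi scales as the inverse of the sample density.
amplification = (1.0 + xi_red) / (1.0 + xi_all)

if __name__ == "__main__":
    print(f"xi_all = {xi_all:.1f}, xi_red = {xi_red:.1f}")
    print(f"amplification of 1 + xi: {amplification:.3f}")
```

Under this idealization the amplification of $1 + \xi$ equals the ratio of the two sample densities, independently of the value chosen for the conditional density.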

The main conclusion is that the intrinsic characteristic length of the model given by $r_{zp}$ (measured as discussed above) is not changed by this selection procedure, in close analogy with what happens in the simple Gaussian thresholding biasing scheme of a CDM field discussed in the previous section.

5. Finite size effects and the integral constraint 

Concerning the study of the zero-crossing scale of $\xi(r)$, a point must be clarified in relation to the estimator of this statistical quantity. Suppose that one chooses the so-called full-shell estimator (Gabrielli et al. 2004), defined as

$$\xi_E(r) = \frac{\langle n(r)\rangle_p}{n_E} - 1 , \qquad (8)$$

where $n_E$ is the density in a sphere of radius $R_s$, up to which $\langle n(r)\rangle_p$ can be estimated. As the sample density is estimated by

$$n_E = \frac{3}{4\pi R_s^3}\int_0^{R_s} \langle n(r)\rangle_p\,4\pi r^2\,dr , \qquad (9)$$

it follows that

$$\int_0^{R_s} \xi_E(r)\,r^2\,dr = 0 . \qquad (10)$$

This condition holds independently of $R_s$ and of the true $\xi(r)$. Thus in a finite sample one finds a zero crossing of $\xi_E(r)$ no matter what the true correlation properties of the distribution are^2. For example $\xi(r)$ can be a simple positive power law extending to scales much larger than $R_s$: its estimator in a finite sample will still obey Eq. 10. The point to study is whether or not the zero-crossing scale depends on the sample volume.
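A minimal numerical sketch of Eqs. 8-10 is given below. The input conditional density is a hypothetical one whose true $\xi(r)$ is a positive power law at all scales (the values of `r0` and `gamma` are illustrative assumptions); the midpoint quadrature and all function names are likewise choices made for this example only.

```python
def conditional_density(r, n0=1.0, r0=5.0, gamma=1.7):
    """Hypothetical <n(r)>_p = n0 * (1 + (r/r0)^-gamma): true xi(r) > 0 everywhere."""
    return n0 * (1.0 + (r / r0) ** (-gamma))

def full_shell_estimator(cond_density, Rs, n_bins=20000):
    """Return bin centres, xi_E (Eq. 8) and n_E (Eq. 9) for a sample of radius Rs."""
    h = Rs / n_bins
    radii = [(i + 0.5) * h for i in range(n_bins)]      # midpoints avoid r = 0
    n_p = [cond_density(r) for r in radii]
    # Eq. 9: average of <n(r)>_p over the sphere of radius Rs
    n_E = 3.0 / Rs**3 * sum(n * r * r for n, r in zip(n_p, radii)) * h
    xi_E = [n / n_E - 1.0 for n in n_p]                 # Eq. 8
    return radii, xi_E, n_E

def constraint_residual(radii, xi_E, Rs):
    """Left-hand side of Eq. 10, normalized by Rs^3 / 3."""
    h = radii[1] - radii[0]
    return sum(x * r * r for x, r in zip(xi_E, radii)) * h / (Rs**3 / 3.0)

def first_zero_crossing(radii, xi_E):
    """Centre of the first bin where the estimator becomes negative."""
    for r, x in zip(radii, xi_E):
        if x < 0.0:
            return r
    return None

if __name__ == "__main__":
    for Rs in (50.0, 100.0, 200.0):
        radii, xi_E, n_E = full_shell_estimator(conditional_density, Rs)
        print(Rs, first_zero_crossing(radii, xi_E), constraint_residual(radii, xi_E, Rs))
```

Even though the input correlation is positive at every scale, the estimator crosses zero inside each sample, and in this example the crossing scale grows in proportion to `Rs`.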

In the case of a CDM-like correlation function, where a similar constraint holds in the whole space (see Eq. 5), one can distinguish between the following behaviors (for simplicity, I neglect in the following discussion the effect of statistical noise in the estimator):

(i) $R_s < r_{zp}$: in this case the positively correlated range at small scales will not be detected entirely, and an artificial zero-point will be introduced at scales comparable to $R_s$. In addition the amplitude of the estimator $\xi_E(r)$ is scale dependent.

(ii) $R_s > r_{zp}$: in this case the zero-crossing scale will be well defined, in the sense that changing $R_s$ will not change the distance scale $r_{zp}$. However the negatively correlated range of scales (i.e. $r > r_{zp}$) will be distorted (and the absolute value of $\xi_E(r)$ is increased) by the condition Eq. 10 (see Fig. 4).

^2 Note that Eq. 10 holds only for the full-shell estimator of $\xi(r)$. However, as discussed in Gabrielli, Joyce & Sylos Labini (2002) and Gabrielli et al. (2004), similar boundary conditions, related to the fact that the average density has been estimated inside a given sample, must be satisfied by any estimator of $\xi(r)$.

Fig. 4. Estimation (E), through the full-shell estimator Eq. 8, of the theoretical (absolute value) $\xi(r)$ (T) given by Eq. 3. In this case $R_s = 500 > r_{zp} = 60$. One may note that the negative tail is distorted in a non-linear way in order to satisfy Eq. 10.

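A similar situation happens when $\langle n(r)\rangle_p$ has a power-law behavior inside a given sample of size $R_s$. (Note that the following argument can be simply modified to any other functional behavior of $\langle n(r)\rangle_p$ in the regime where $\langle n(r)\rangle_p > n_0$.) Suppose then that the scale where $\langle n(r)\rangle_p \approx n_0$ is larger than $R_s$ and that

$$\langle n(r)\rangle_p = B\,r^{-\gamma} , \qquad (11)$$

where $B$ is a constant and $3 > \gamma > 0$. Neglecting fluctuations, the estimation of the sample density from Eq. 9 becomes

$$n_E \approx \frac{3}{3-\gamma}\,B\,R_s^{-\gamma} , \qquad (12)$$

so that the estimation of $\xi(r)$ can be written as (again, neglecting fluctuations)

$$\xi_E(r) \approx \frac{3-\gamma}{3}\left(\frac{r}{R_s}\right)^{-\gamma} - 1 . \qquad (13)$$

In this case both the scale at which $\xi_E(r) = 1$ and the scale at which $\xi_E(r) = 0$ depend linearly on the sample size $R_s$.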
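The linear dependence on the sample size can be read off by inverting Eq. 13, as in the following sketch. The function names are illustrative, and the value $\gamma = 1.7$ used in the usage example is only an assumed input.

```python
def xi_E_power_law(r, Rs, gamma):
    """Eq. 13: estimator for a pure power-law conditional density in a sample of size Rs."""
    return (3.0 - gamma) / 3.0 * (r / Rs) ** (-gamma) - 1.0

def zero_crossing_scale(Rs, gamma):
    """Scale where Eq. 13 vanishes: r_zp = Rs * ((3 - gamma) / 3)^(1/gamma)."""
    return Rs * ((3.0 - gamma) / 3.0) ** (1.0 / gamma)

def unity_scale(Rs, gamma):
    """Scale where Eq. 13 equals one: r_0 = Rs * ((3 - gamma) / 6)^(1/gamma)."""
    return Rs * ((3.0 - gamma) / 6.0) ** (1.0 / gamma)

if __name__ == "__main__":
    gamma = 1.7  # assumed exponent, for illustration
    for Rs in (50.0, 100.0, 200.0):
        print(Rs, unity_scale(Rs, gamma), zero_crossing_scale(Rs, gamma))
```

Both scales are a fixed fraction of `Rs` set only by the exponent, so doubling the sample size doubles each of them.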

Note that the estimation in Eq. 12 has been done by assuming that one can perform a volume average also at the scale of the sample: this means that one has made an average over different samples of size $R_s$. In case this is not possible (i.e. the usual situation in galaxy catalogs), significant deviations from the estimation given by Eqs. 12-13 can be found (see Gabrielli et al. 2004 for a detailed explanation of this point).

It is worth noticing that while statistical noise may change the scale where $\xi(r_{zp}) = 0$, it does not change the fact that such a scale depends on the sample size as long as the conditional density has not become constant as a function of scale. However one should note that for a functional behavior consisting of strong power-law correlations followed by a regime where $\xi(r)$ is very small (or zero or negative as in the CDM case), the scale $r_{zp}$ can be easily identified as the scale where a sharp break from the power-law behavior is manifest, which corresponds to the scale where $\langle n(r)\rangle_p \approx n_0$. This is actually the way in which the constraint imposed by Eq. 10 is evident. The situation where the scale $r_{zp}$ corresponds to a real feature, i.e. $R_s \gg r_{zp}$, is much more problematic to measure and requires a very careful analysis of the estimator errors. For example the detection of very small amplitude correlations can be masked, at least, by Poisson noise going as $1/\sqrt{N}$.

6. Comparison with observations 

The characterization of galaxy clustering is usually performed through the study of the reduced two-point correlation function. The result found in various galaxy catalogs is that $\xi(r) \approx A\,r^{-\gamma}$ when $\xi(r) > 1$, with $\gamma = 1.7$ and $A$ a constant which takes different values in different volume-limited (hereafter VL) subsamples (e.g. Davis & Peebles 1983; Davis et al. 1988; Norberg et al. 2002; Zehavi et al. 2004A).

One should note that a VL sample is constructed in such a way as to contain all galaxies brighter than a certain absolute magnitude threshold, and it is limited by a distance depending on the apparent magnitude limit of the galaxy catalog and on the absolute magnitude threshold considered (e.g. Davis & Peebles 1983). This implies that a VL sample is identified (at least) by two cuts, one in the distance $R_{VL}$ and one in the corresponding absolute magnitude $M_{VL}$, the relation between the two being (at small redshift, neglecting corrections)

$$M_{VL} = m_{lim} - 5\log_{10} R_{VL} - 25 , \qquad (14)$$

where $m_{lim}$ is the apparent magnitude limit of the considered galaxy survey and $R_{VL}$ is measured in Mpc/h. Thus, when one increases $R_{VL}$ only galaxies with brighter absolute luminosity (decreasing absolute magnitude $M < M_{VL}$) are included in the sample. (In the latest surveys like SDSS and 2dF there are two cuts in apparent magnitude, and thus a VL sample is identified by two cuts in absolute magnitude and two in distance: this complicates the estimation of the depth of the samples but does not introduce a substantial change in the following discussion.)
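The coupling between depth and luminosity cut in Eq. 14 can be sketched as follows. The apparent magnitude limit used in the usage example is a hypothetical input chosen for illustration, not a value taken from the surveys discussed here, and the function names are likewise assumptions.

```python
import math

def absolute_magnitude_cut(m_lim, R_vl):
    """Eq. 14: M_VL = m_lim - 5 log10(R_VL) - 25, with R_VL in Mpc/h."""
    return m_lim - 5.0 * math.log10(R_vl) - 25.0

def depth_for_magnitude_cut(m_lim, M_vl):
    """Inverse of Eq. 14: R_VL (Mpc/h) of the VL sample with threshold M_VL."""
    return 10.0 ** ((m_lim - M_vl - 25.0) / 5.0)

if __name__ == "__main__":
    m_lim = 17.77  # hypothetical apparent magnitude limit
    for R_vl in (100.0, 200.0, 400.0):
        print(f"R_VL = {R_vl:5.0f} Mpc/h -> M_VL = {absolute_magnitude_cut(m_lim, R_vl):.2f}")
```

Each doubling of the depth makes the threshold brighter by $5\log_{10}2 \approx 1.5$ magnitudes, which is why luminosity selection and sample volume cannot be varied independently in VL samples.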

Given the two parameters $R_{VL}$, $M_{VL}$ defining a VL sample, one may consider (at least) two different effects which may cause the amplification of $\xi_E(r)$: (i) a luminosity (or sampling) effect related to the selection of different classes of objects in different VL samples^3; (ii) a finite-size effect related to the change of the volume of the samples considered when the absolute magnitude cut is changed. In other words the variation of the amplitude of $\xi_E(r)$ can be related to a sampling effect or to a volume effect (e.g. Eq. 13). The situation is illustrated in Figs. 5-6.

^3 A similar effect happens when the selection is done on the basis of galaxy color (e.g. Zehavi et al. 2004A). As there is a correlation between galaxy color and luminosity, this adds a complication but no essential change to the logic of the argument.

Fig. 5. Amplification of $\xi_E(r)$ due to a sampling procedure similar to what is found in N-body simulations. The underlying particle distribution (1) and the selected one (2) have a different amplitude in the regime of strong clustering but show the same zero-crossing scale in the reduced correlation function, which has a power-law decay up to a definite scale. The inset panel shows the corresponding behavior of the conditional density. Note that the scale $r_{zp}$ coincides with the scale where $\langle n(r)\rangle_p \approx$ constant.

Note that the variation of the amplitude of $\xi_E(r)$ (or of its Fourier conjugate, the power spectrum) is usually (e.g. Davis & Peebles 1983; Davis et al. 1988; Norberg et al. 2002; Zehavi et al. 2004A) ascribed to the fact that galaxies of different luminosity are differently clustered, in the sense that brighter galaxies have a larger amplitude than fainter ones (this is usually called "luminosity bias"): i.e. $A = A(M_{VL})$ is an increasing function of the absolute luminosity of the considered galaxies (e.g. Norberg et al. 2002; Zehavi et al. 2004A). Also for galaxy clusters a similar variation in the amplitude, although larger, has been found (e.g. Bahcall & Soneira 1983), where the variation is ascribed to the richness of the clusters considered. In brief this variation is ascribed to some specific ways of sampling the (galaxy or cluster) point distribution^4. If this were the case, then one should find for the zero-crossing length scale $r_{zp}$ a situation analogous to the one shown in Fig. 5, i.e. this scale should be the same for different objects.

Thus in order to distinguish between the two different mechanisms of amplification of $\xi(r)$ one has an indirect and a direct test. The former consists in the study of the stability of the zero-crossing length scale in different samples, while the latter is represented by the determination of the conditional density in VL samples. As the conditional density is not usually estimated (e.g. Zehavi et al. 2004B; Norberg et al. 2002) I need to consider also the stability of $r_{zp}$. Below I will comment on the relation with the measurements of $\langle n(r)\rangle_p$ recently performed by Hogg et al. (2005) and the various determinations summarized in Gabrielli et al. (2004).

^4 Note that Zehavi et al. (2004A) made some specific measurements able to test for finite-size effects, with the result that large sample fluctuations do alter the amplitude of $\xi_E(r)$. A discussion of these results can be found in Joyce et al. (2005).

Fig. 6. In the case the distribution has a conditional density with a power-law behavior up to the sample size (1), then the amplitude of $\xi_E(r)$ and its zero-crossing scale depend linearly on the sample size (2) if the sample size is larger than the scale where $\langle n(r)\rangle_p$ has a clear flattening toward a constant value. The inset panel shows the corresponding behavior of the conditional density: in this case the two lines coincide, the case (2) extending to larger scales.

It is interesting to briefly review some determinations in redshift space of the scale $r_{zp}$: in the CfA1 sample $r_{zp} \approx 20$ Mpc/h (Davis & Peebles 1983); Park et al. (1994) found, in the CfA2 catalog, a larger value of about $r_{zp} \approx 30$ Mpc/h (see their Fig. 10), and Benoist et al. (1996) found that $r_{zp}$ is not stable in different VL samples of the SSRS2 survey, changing from 10 to about 50 Mpc/h (see their Fig. 1). More recently it has been found that $r_{zp} \approx 40$ Mpc/h in the Two degree Field Galaxy Redshift Survey (Hawkins et al. 2003). The latest determination has been performed by Eisenstein et al. (2005) by considering the Luminous Red Galaxies sample from the SDSS. This sample covers the largest volume of the universe surveyed up to now. In Eisenstein et al. (2005) the zero-point of the correlation function is found to be at a scale of about 120 Mpc/h (see their Figs. 2-3). Thus it seems that, up to now, in galaxy samples, the length scale $r_{zp}$ is related to the length scale $r_0$ (defined by $\xi(r_0) = 1$): they are both sample-size dependent. Whether the latest measurement by Eisenstein et al. (2005) is stable will have to be shown by the analysis of larger samples.

Note that, as discussed above, one of the main characteristics of the selection mechanisms usually considered is that the zero-crossing scale of $\xi(r)$ is invariant under sampling. Thus even if one uses a very particular kind of objects, results on the zero-crossing scale have to be the same for any other kind of objects if the differences in the correlation function (or power spectrum) are explained by a selection effect similar to what is found in the N-body simulations. If the zero-crossing scale is instead not found to be stable in different samples and thus for different objects, this is a clear indication that correlation properties are finite-size dependent in the sense of Fig. 6.

The direct test for this (corresponding to the inset
panels in Figs. 5-6) has been implicitly performed by Hogg
et al. (2005), who measured the (integrated) conditional
density for the same Luminous Red Galaxies sample
considered by Eisenstein et al. (2005). They in fact find
that the conditional density, having a power-law behavior
with exponent γ ≈ 1 up to 20-30 Mpc/h, shows a slow
crossover toward homogeneity, reaching a constant value
at about 70 Mpc/h. These results support the conclusion
drawn here, that the zero-crossing scales found in
previous, smaller-volume surveys are finite-size effects.
The results by Hogg et al. (2005) are then in agreement
with those of Eisenstein et al. (2005): here we note that
the flattening of ⟨n(r)⟩_p occurs at scales comparable to
the sample size, and thus this situation requires a careful
study of larger samples to confirm these results over a
substantial range of scales (see discussion in Joyce et al.
2005).
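
As an illustration of the quantity involved, the following is a minimal sketch of an integrated conditional density estimator: the average number of neighbours within distance r of a sample point, divided by the volume of the sphere of radius r. It is not the procedure of Hogg et al. (2005). In particular it ignores the sample boundaries, whereas a real analysis would typically use as centers only those points whose sphere of radius r lies entirely inside the sample volume. The function name and the random test points are hypothetical.

```python
import numpy as np


def integrated_conditional_density(points, radii):
    """Mean neighbour count within each radius, per unit spherical volume.

    Assumptions: points are 3D Cartesian positions, and boundary effects
    are ignored (every point is used as a center for every radius).
    """
    points = np.asarray(points, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # All pairwise separations; O(N^2) memory, adequate for a small example.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    # counts[i, k] = number of neighbours of point i within radii[k]
    counts = (dist[:, :, None] <= radii[None, None, :]).sum(axis=1)
    volumes = 4.0 / 3.0 * np.pi * radii ** 3
    return counts.mean(axis=0) / volumes


# Hypothetical usage on uniformly distributed points in a unit box.
rng = np.random.default_rng(0)
sample = rng.uniform(size=(500, 3))
print(integrated_conditional_density(sample, [0.05, 0.1, 0.2]))
```

For a distribution with power-law correlations the returned values decay with r as a power law, while a flattening signals the crossover toward homogeneity discussed in the text.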

7. Conclusions 

The study of the dependence of the zero-crossing scale on
the size of a given sample is an already available test
to distinguish between the different effects producing
the variation of the amplitude of ξ(r). As long as it
is found to be dependent on the finite sample size, this
means that all amplitudes related to ξ(r) are also finite
size dependent. In such a situation a clearer way to
study the problem is represented by the analysis of the
conditional density ⟨n(r)⟩_p (see e.g. Gabrielli et al. 2004).
From a review of the literature it seems that the scale
r_zp has grown from 20 Mpc/h in the CfA1 sample
(Davis & Peebles 1983) to about 120 Mpc/h in the latest
SDSS data (Eisenstein et al. 2005). Analogously the
scale r_0 (defined as ξ(r_0) = 1) has grown from about 5
Mpc/h in the CfA1 (Davis & Peebles 1983) to about 13
Mpc/h in the SDSS sample (Zehavi et al. 2004B).

This implies that the explanation of the amplitude
variation of ξ(r) by luminosity bias (brighter objects have
larger amplitudes) is untenable. Such a variation can
instead be explained as a finite size effect. To test this
directly one may simply measure the conditional density;
results for this quantity (Sylos Labini et al. 1998, Hogg et
al. 2005) unambiguously support the conclusion that the
variations of the amplitude of ξ(r) and of its zero-crossing
length are finite-size effects (see Figs. 5-6). This situation
implies that r_0 is sample size dependent up to the scale
where ⟨n(r)⟩_p has a clear crossover. If one considers such
a scale to be 70 Mpc/h, as suggested by Hogg et al. (2005),
then r_0 ≈ 13 Mpc/h for galaxies of any luminosity. Note
that the prediction of Eq. 13 does not apply in this
situation, as the conditional density measured by Hogg et
al. (2005) shows two different behaviors in the strongly
clustering regime: a simple power law up to about 20 Mpc/h
and a slow crossover up to 70 Mpc/h. In this situation the
estimation of r_0 has to be done numerically.
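
A numerical estimate of this kind can be sketched as follows. Given any monotonically decreasing model of the conditional density and an asymptotic mean density n_0, the scale r_0 solves ξ(r_0) = ⟨n(r_0)⟩_p / n_0 - 1 = 1, which a simple bisection finds. The model used below (a single power law with exponent 1 that becomes constant at 70 Mpc/h) is a hypothetical toy chosen only to exercise the solver. It does not include the slow crossover measured by Hogg et al. (2005) and is not meant to reproduce the value of r_0 quoted in the text.

```python
def solve_r0(cond_density, n_mean, r_min, r_max, tol=1e-8):
    """Bisection solve of xi(r_0) = 1, i.e. cond_density(r_0) = 2 * n_mean.

    Assumption: cond_density decreases monotonically on [r_min, r_max]
    and the root is bracketed by that interval.
    """
    def excess(r):
        return cond_density(r) - 2.0 * n_mean

    if excess(r_min) < 0.0 or excess(r_max) > 0.0:
        raise ValueError("xi = 1 is not bracketed by [r_min, r_max]")
    lo, hi = r_min, r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid   # density still above 2 * n_mean: root lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)


def toy_density(r, n_mean=1.0, r_hom=70.0, gamma=1.0):
    """Hypothetical model: power law below r_hom, constant n_mean above."""
    return n_mean * max((r / r_hom) ** (-gamma), 1.0)


print(solve_r0(toy_density, n_mean=1.0, r_min=1.0, r_max=70.0))
```

Replacing toy_density with a tabulated or fitted ⟨n(r)⟩_p, interpolated between the measured points, is the step that would be needed for an actual estimate.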

The difference between the zero-crossing length scale,
identified by the sharp cut-off in a log-log plot of the
correlation function, in the galaxy catalogs extracted from
N-body simulations (which is about 30 Mpc/h) and the
one detected by Eisenstein et al. (2005) for the largest
observational sample of the SDSS available up to now is
about a factor of five. In the situation considered here the
zero-point of ξ(r) is the scale where ⟨n(r)⟩_p ≈ n_0, and
thus it is related to the size of the largest non-linear
structure in the distribution. This implies that structures
formed in N-body simulations are smaller than galaxy
structures. This can be directly tested by comparing the
scale where ⟨n(r)⟩_p ≈ const. in simulations and in galaxy
samples (see discussion in Joyce et al. 2005).

It is important to stress that the conditional density 
in N-body simulations (see FigEJ has a slope of about 
γ = −1.7 while in galaxy catalogs Sylos Labini et al.
(1998) and Hogg et al. (2005) have measured γ = 1. While
the analysis of ξ(r) does not give a clear determination of
the slope γ, as it is affected by a finite size effect when
⟨n(r)⟩_p is a power law, the analysis of the conditional
density provides a clear result (see discussion in Gabrielli
et al. 2005). In other words, while the comparison of ξ(r)
in simulations and galaxy samples can be misleading, this
is not the case for ⟨n(r)⟩_p.

One should also note that in N-body simulations the
slope is determined in real space, while the results in
galaxy catalogs considered here are in redshift space.
However, the scales involved (some tens of Mpc/h, where
peculiar velocities are expected to be small) and the large
difference in the slopes found (about 0.7) point toward a
real difference between structures formed in N-body
simulations and those observed in galaxy catalogs.

It is worth noticing that the scale r_zp, in CDM models,
is simply related to the so-called turn-over wave-number
of the power spectrum, i.e. where the power spectrum
changes regime from negative to positive power law. In this
respect, I note that for the determination of the power
spectrum of density fluctuations a finite size effect in the
amplitude and in the location of the turn-over scale, in a
similar way to what happens for ξ(r), is expected to be
present as long as the distribution has strong clustering
inside a given sample (Sylos Labini & Amendola 1996).
Such a situation allows one to simply relate the results
of Tegmark et al. (2004) for the power spectrum in the
SDSS survey to the results obtained by the real space
correlation function analysis by Zehavi et al. (2004B).